Abstract

One of the fundamental problems in distributed computing is how to efficiently perform routing in 
a faulty network in which each link fails with some probability. This paper investigates how big the 
failure probability can be, before the capability to efficiently find a path in the network is lost. Our main 
results show tight upper and lower bounds for the failure probability which permits routing, both for 
the hypercube and for the $d$-dimensional mesh. We use tools from percolation theory to show that in
the $d$-dimensional mesh, once a giant component appears, efficient routing is possible. A different
behavior is observed when the hypercube is considered. In the hypercube there is a range of failure
probabilities in which short paths exist with high probability, yet finding them must involve querying
essentially the entire network. Thus the routing complexity of the hypercube shows an asymptotic phase
transition. The critical probability with respect to routing complexity lies in a different location than
that of the critical probability with respect to connectivity. Finally we show that oracle access to
links (as opposed to local routing) may significantly reduce the complexity of the routing problem. We
demonstrate this fact by providing tight upper and lower bounds for the complexity of routing in the
random graph $G_{n,p}$.
1 Introduction



The goal of this paper is to investigate the effectiveness of routing in faulty networks. Suppose that a
network is represented by a graph $G$. Two kinds of fault models are common in the theoretical literature:
worst-case faults and random faults, the latter being our concern. In the random fault model it is assumed that
each component of the network fails with some probability, independently of all other components. In
this paper we consider edge failures, so we assume each edge in $G$ fails independently with some probability
$q = 1 - p$. Let $G_p$ be the resulting graph. One can ask what is the probability that two nodes $u$ and $v$ remain
connected in $G_p$. This has been the focus of much research concerning the existence of giant components
in such graphs, and the critical values of $p$ for the existence of those, cf. lQ, [30], [20], [2], [23]. But in many
applications the fact that a path between $u$ and $v$ exists is not sufficient; one wants to be able to find the path
in a distributed manner.

It is known that if the topology of a graph has some randomness, then the existence of short paths in a
graph does not guarantee the ability to find them efficiently. For instance, a cycle with a random matching
has logarithmic diameter [6], yet paths connecting a given pair of nodes cannot be found in less than $\sqrt{n}$
time [21]. This phenomenon is especially acute when considering 'natural' networks such as the world wide
web, social networks, P2P networks etc., in which typically the network size is huge, the diameter of the
network is small and the challenge is to find short paths within a time complexity that is comparable to the
diameter. Indeed Kleinberg's model of the small world phenomenon [21, 22] is aimed at explaining the
ability to find short paths in social networks (and not merely their existence). In the context of P2P, several
randomized topologies were proposed along with routing algorithms that find short paths in the random
graph, cf. [5][^][26]. Many P2P routing algorithms are able to find paths between
nodes even when nodes or links fail, cf. [18], [29], [32]. While our findings do not apply directly to these
networks, we expect that our main result would hold for them as well. See Section 1.3 for more details.

In this paper we analyze the algorithmic complexity of finding a path between nodes $u, v$ in $G_p$ as a
function of the failure probability. In particular we seek to find the exact values of $p$ for which it is possible
to perform routing in $G_p$ within time complexity that is comparable to the diameter. One difficulty is that
with positive probability $u, v$ are in distinct components of $G_p$. We therefore restrict our attention to the
case where a giant component exists, and condition on the event that $u, v$ are connected.

Our findings present a complex picture. We show that for some graphs, such as the $d$-dimensional mesh,
efficient routing is always possible, i.e. it is easy to find with high probability short paths between nodes 
within the giant component (whenever it exists). However, for other graphs, such as the hypercube, efficient 
routing is possible only for some failure probabilities. In other words, there is a range of failure probabilities 
for which with high probability a giant component exists, the diameter of the giant component is small, yet 
in order to find a path between nodes it is necessary to probe a large portion of the graph. We provide tight 
upper and lower bounds on the routing complexity, indicating the exact location of the transition. 

1.1 The Model 

Definition 1. Given a graph $G_p$ and two vertices $u, v$, a routing algorithm is an algorithm that is allowed to
probe whether an edge exists in $G_p$, and outputs a path between $u$ and $v$ if one exists. A routing algorithm is
said to be local if the first edge it probes is adjacent to $u$, and subsequently it probes only edges to (an
endpoint of) which it has already established a path from $u$.

Local algorithms aim to capture the realistic constraints of routing in a network. If each node is a server
in a network and $u$ wishes to send a message to $v$, then $u$ must find a path to $v$ while probing only edges it has
already reached. In Section 5 we show that a local router may require exponentially more probes than a
non-local one, thus the distinction between the two kinds of algorithms is necessary. A non-local routing
algorithm may be referred to as an oracle routing algorithm. Denote by $\{u \sim v\}$ the event that $u$ is indeed
connected to $v$.
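The probing model of Definition 1 can be sketched in code. The following is a minimal illustration and not part of the paper's formalism: the class names `PercolatedGraph` and `LocalProber` are hypothetical, edge states are sampled lazily on first probe, and the caller is assumed to probe only edges of the underlying graph $G$.

```python
import random


class PercolatedGraph:
    """Lazily sampled G_p: every edge is open with probability p, decided on first probe."""

    def __init__(self, p, seed=0):
        self.p = p
        self.rng = random.Random(seed)
        self.state = {}  # frozenset({a, b}) -> True (open) / False (closed)

    def probe(self, a, b):
        """Oracle probe: report whether the edge {a, b} is open."""
        key = frozenset((a, b))
        if key not in self.state:
            self.state[key] = self.rng.random() < self.p
        return self.state[key]

    @property
    def probes(self):
        """Number of distinct edges probed so far."""
        return len(self.state)


class LocalProber:
    """Wrapper enforcing locality: a probed edge must touch a vertex already reached from u."""

    def __init__(self, graph, u):
        self.graph = graph
        self.reached = {u}

    def probe(self, a, b):
        if a not in self.reached and b not in self.reached:
            raise ValueError("non-local probe: no known open path from u to this edge")
        is_open = self.graph.probe(a, b)
        if is_open:
            self.reached.update((a, b))
        return is_open
```

An oracle router would call `PercolatedGraph.probe` directly, while a local router is restricted to `LocalProber.probe`; in both cases `probes` counts the queries made.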

Definition 2. Given a graph $G$, a probability $p$ and a routing algorithm $A$, the routing complexity of $A$, denoted
by comp($A$), with respect to the nodes $u, v$, is the random variable that counts the queries $A$ makes (i.e.
edges probed) to find a path between $u$ and $v$ in $G_p$, conditioned on $\{u \sim v\}$.

The routing complexity measures how many probes are needed to route a message from $u$ to $v$ in $G_p$,
assuming this routing is possible. We do not consider here any computations that the algorithm performs.
As indicated above, the question is most interesting when $\Pr[u \sim v]$ is bounded away from zero, and indeed
we limit our discussion to this case. A simple upper bound on the routing complexity can be achieved
by performing a BFS on $G_p$. In terms of routing complexity this is tantamount to probing the
entire graph. However, there may exist algorithms which achieve a much smaller routing complexity. In
particular, if the diameter of $G$ is small, we are interested in finding a routing algorithm with a complexity
that is comparable to the actual distance between the nodes, or in showing that none exists.
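As an illustration of the BFS baseline mentioned above, here is a minimal sketch of a local BFS router that counts its probes. The function names and the lazy sampling of edge states are assumptions of this example, and $u \ne v$ is assumed.

```python
import random
from collections import deque


def bfs_route(neighbors, u, v, p, seed=0):
    """Local BFS router on a lazily sampled G_p.

    neighbors(a) lists the neighbors of a in the fault-free graph G.
    Returns (path, probes); path is None if v is not in the open cluster of u.
    """
    rng = random.Random(seed)
    probes = 0
    parent = {u: None}
    queue = deque([u])
    while queue:
        a = queue.popleft()
        for b in neighbors(a):
            if b in parent:
                continue  # b is already reached, no need to probe this edge
            probes += 1  # each edge is probed at most once, so sampling here is consistent
            if rng.random() < p:
                parent[b] = a
                if b == v:
                    path = [v]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1], probes
                queue.append(b)
    return None, probes


def hypercube_neighbors(n):
    """Neighbor function of the n-dimensional hypercube on vertices 0 .. 2^n - 1."""
    return lambda a: [a ^ (1 << i) for i in range(n)]
```

For example, `bfs_route(hypercube_neighbors(10), 0, 1023, p=0.5)` returns an open path if one exists, together with the number of edges probed, which in the worst case is the whole open cluster of $u$.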

We stress that the routing complexity measures the expected complexity of finding a path between 
two specified vertices, and does not necessarily indicate the difficulty of performing a full blown routing 
scheme. Small routing complexity may be seen as the minimal requirement of fault tolerance in networks. 
Naturally such a weak requirement strengthens our lower bounds and weakens our upper bounds. In this 
paper we focus on analyzing the hypercube and the $d$-dimensional grid, which are probably the most widely
investigated topologies in this context. 

1.2 Related Work 

Denote by $H_{n,p}$ the $n$-dimensional hypercube in which each edge is deleted with probability $1 - p$ and survives
with probability $p$. Random subgraphs of the hypercube have been the focus of much research. It is known
(see e.g. [11]) that if $p < 1/2$ then with high probability $H_{n,p}$ is not connected, and if $p > 1/2$ then with
high probability $H_{n,p}$ is connected. A classic result by Ajtai, Komlós and Szemerédi [1] states that if
$p > n^{-1}(1 + \epsilon)$ for any fixed $\epsilon > 0$ then with high probability¹ $H_{n,p}$ contains a giant component (i.e. a
component with $\Omega(2^n)$ nodes), while if $\epsilon < 0$ then w.h.p. a giant component does not exist. This result was
sharpened by Bollobás et al. in Q and then by Borgs et al. in [8].

A notion related to routing complexity is that of emulation. Roughly speaking, network $A$ emulates
network $B$ if $A$ can perform any computation $B$ performs with a constant slowdown. When the emulating
network is a random subgraph, the notion of emulation implies not only that short paths can be found but
also that they do not create bottlenecks in the computation. Håstad et al. [16, 17] considered node failures,
and showed that if $p$ is a constant close enough to 1, then $H_{n,p}$ can emulate $H_n$ with a small slowdown.
Cole et al. [10] proved that a faulty butterfly network can perform efficient permutation routing even if each
node or edge fails with some constant probability. Emulation under worst-case faults was considered by
Leighton et al. [24]. In particular, these results imply that if the failure probability is small enough then it
is possible to find paths efficiently between nodes in the giant component. On the other hand, Angel and
Benjamini showed that if $p < 1/\sqrt{n}$ then the hypercube cannot be embedded in its giant component
with constant distortion. This result suggests that when $1/n < p < 1/\sqrt{n}$, even though a giant component
exists w.h.p., it defines a metric that is different from that of the hypercube.

¹Throughout this paper, the term 'with high probability' means with probability that tends to 1 as $n \to \infty$.
Let $M^d$ be a $d$-dimensional mesh with $M^d$ nodes, in which each edge is deleted with probability $1 - p$.
It is known that for each $d$ there exists a critical probability $p_c^d$ such that if $p < p_c^d$ then w.h.p. there is no
giant component in $M^d$, and if $p > p_c^d$ then w.h.p. $M^d$ contains a giant component. The exact values
of the critical probabilities are not always known. It is known that $p_c^2 = 1/2$ and that $p_c^d = (1 + o(1))/(2d)$
and is decreasing in $d$. See the book by Grimmett [12] and the references therein. Kaklamanis et al. [19]
showed that if $p$ is large enough then $M_p^2$ can emulate $M^2$ with $O(\log n)$ slowdown. Mathies [27] extended
this result to any $p > p_c^2 = 1/2$. These results do not imply an efficient routing algorithm. Naor and Wieder
[28] used planar duality to prove that efficient routing is possible in $M_p^2$ whenever $p > 1/2$. Cole et al.
proved that a two-dimensional array can tolerate a constant fraction of worst-case faults and still emulate the
non-faulty array with a constant slowdown.

1.3 Summary of Results 

Recall that $H_{n,p}$ denotes a random subgraph of the $n$-dimensional hypercube obtained by selecting each
edge independently with probability $p$. As mentioned, it is known that when $p > (1 + \epsilon)n^{-1}$, with high
probability a giant component exists (Q). Furthermore, it is implicit in the proof that the diameter of the
component is polynomial in $n$. If however $p < n^{-1}(1 - \epsilon)$ then the size of the largest connected component
is $o(2^n)$ w.h.p. This suggests that efficient routing in the giant component of $H_{n,p}$ might be possible for any
$p > n^{-1}(1 + \epsilon)$. Our work shows that this is not the case: there is a threshold for efficient routing that lies in
a different location. The following theorem provides a complete characterization of the routing complexity
as a function of the failure probability:

Theorem 3. For a fixed $\alpha$, let $p = n^{-\alpha}$.

(i) If $\alpha > 1/2 + \beta$ for $\beta > 0$, any local routing algorithm in the hypercube $H_{n,p}$ makes at least $2^{n^\beta}$
queries w.h.p.

(ii) There is a local routing algorithm $A$ on the hypercube such that the following holds: for any $\alpha < 1/2$
there exists $k = k(\alpha)$ so that for any two vertices, comp($A$) $< n^k$ with probability $1 - \exp(-c n^{1-\alpha})$
for some constant $c > 0$.

Thus for $p = n^{-\alpha}$ with $1/2 < \alpha < 1$, an intriguing phenomenon occurs: the giant component of $H_{n,p}$
shares some structural properties of $H_n$, in particular it has diameter poly($n$) (w.h.p.), and has roughly the
same expansion as $H_n$, yet the ability to find short paths is lost. Angel and Benjamini proved in [3] that for
these failure probabilities $H_n$ cannot be embedded in $H_{n,p}$ with constant distortion, so the result of Part
(i) is not entirely surprising, yet the techniques we use are different from those of [3]. Part (i) is proven by
showing that if $p$ is small enough then a ball centered at $v$ is likely to look more or less like a tree rooted
at $v$, which contains closed edges. Now, in order to reach $v$ from $u$ it is necessary to find a leaf which is
connected to $v$ via an open path, an event which is proven to be rare. Our technique is general enough to be
used on other families of graphs.

It is proven in [3] that if $\alpha < 1/2$ then there is an embedding of $H_n$ in $H_{n,p}$ with constant distortion.
This embedding is used to derive the matching upper bound of Part (ii). Note that the algorithm of (ii) does
not depend on $\alpha$, and only its efficiency changes. Therefore if $\alpha = 0$, i.e. there are no faults, then the
algorithm reduces to a greedy algorithm which routes along the hypercube's shortest paths.

Many popular P2P topologies share some structural similarities with the hypercube, cf. [32], [5], [31]. We
did not prove that Theorem 3 holds for these topologies, yet it is reasonable to assume that this is
the case. If so, then the theorem implies that if the network suffers many faults, flooding and gossiping
techniques would remain efficient means to locate data (in terms of latency) while the routing-based exact
search algorithms fail.
The phenomenon described in Theorem 3 does not occur in all graphs. Recall that $M^d$ denotes a
$d$-dimensional mesh with $M^d$ nodes, in which each edge is deleted with probability $1 - p$.

Theorem 4. Let $u, v$ be two vertices at distance $n$ in $M^d$. There exists a local routing algorithm in $M^d$ so
that if $p > p_c^d$ then the expected routing complexity is $O(n)$.

Thus in the mesh, when $p$ is large enough so that a giant component appears, it is possible to find paths
between two vertices in time comparable to the distance between the nodes. It is important to note that if
we allow $p$ to be close enough to 1 then Theorem 4 is fairly easy to prove. The main difficulty is proving that
the theorem's statement is correct for any $p > p_c^d$; this involves some deep results from percolation theory.

The previous two theorems assumed the routing algorithms are local, i.e. they are only allowed to probe
edges to which they have already established a path. What if we remove the locality assumption and allow
the routing algorithm to probe any edge in the graph? We call this model oracle routing. At first glance it
might seem that oracle routing would not change the routing complexity considerably. Yet, in Section 5 we
show a graph in which there is an exponential gap between the routing complexity with respect to oracle
routing and that of local routing. We also provide tight upper and lower bounds for routing in $G_{n,p}$, and
show that in this natural model oracle routing outperforms local routing.

In the next section we prove a lemma which provides a lower bound for routing complexity in a general
scenario. In Sections 3 and 4 we prove our results for the hypercube and the mesh respectively. Section 5
concerns the oracle routing model. Finally Section 6 discusses some related open problems.



2 The Lower Bound Lemma 

In this section we prove a lemma which is instrumental in proving hardness of local routing on various
graphs. The basic intuition can be seen through the following example: consider a graph in which there
are exactly $d$ edge-disjoint paths of length 2 between nodes $u, v$. Now assume each edge remains open
with probability $1/\sqrt{d}$. We expect that both $u$ and $v$ would be connected to about $\sqrt{d}$ open edges, thus by the
birthday paradox there is an open path of length 2 between the nodes with probability bounded away from
zero (each of the $d$ paths is open with probability $1/d$). Assuming such a path
exists, it is easy to see that $\Omega(d)$ edges should be probed w.h.p. before one of these paths is found.

This intuition can be generalized as follows: if $S$ is a subset of the nodes, and $v \in S$ while $u \notin S$,
then a path from $u$ to $v$ must at some point find an edge in the cut $(S, \bar{S})$ which is connected to $v$. If the
probability that an edge in the cut is connected to $v$ via the set $S$ is low enough, then many such edges should
be probed before a path is found. More formally: for a set $S$ and vertices $x, y \in S$ we write $\{(x \sim y) \in S\}$
for the event that $x$ is connected to $y$ by an open path in the set $S$. Similarly $\{(x \sim e) \in S\}$ denotes the
event that $x$ is connected to an endpoint of the edge $e$ via a path in the set $S$.

Lemma 5. Let $V = S \cup \bar{S}$ be a disjoint partition of the vertex set of a graph and $v \in S$ a vertex. Assume
that for any edge $e$ crossing the cut $(S, \bar{S})$ we have $\Pr[(v \sim e) \in S] \le \eta$, and let $X$ be the number of queries
made by a local routing algorithm from $u$ to $v$. Then

$$\Pr[X < t] \le \frac{t\eta + \Pr[(u \sim v) \in S]}{\Pr[u \sim v]}. \tag{1}$$

If $u \notin S$ then the numerator becomes $t\eta$.
Proof. If $\{(u \sim v) \in S\}^c$ occurs (which is always the case if $u \notin S$), then any path from $u$ to $v$ crosses the cut.
Finding a path from $u$ to $v$ involves finding a path in $S$ from some edge of the cut to $v$. Each probed edge
of the cut has probability at most $\eta$ of being in such a path. For any set of $t$ edges in the cut $(S, \bar{S})$, the
probability that one of them is connected to $v$ in $S$ is at most $t\eta$. Thus the probability of finding a path from
$u$ to $v$ by probing only those $t$ edges of the cut is at most $t\eta + \Pr[(u \sim v) \in S]$.

Since this bound is uniform in the set of edges, the fact that edges may be chosen adaptively does not 
invalidate the bound. Similarly, the constraint of only probing edges reachable from u only reduces the 
possible sets of queries. Finally, the denominator stems from conditioning on {u ~ v} (in the complexity 
definition). □ 



2.1 An Illustrative Example 

In the following we use Lemma 5 to lower bound the routing complexity of the double binary tree. The double
binary tree of depth $n$, denoted $TT_n$, is constructed by taking two binary trees of uniform depth $n$ and
identifying their corresponding leaves. Let $x, y$ be the two roots of the trees. First we identify the failure
probabilities for which $x, y$ are connected with probability bounded away from 0.

Lemma 6. If $1/\sqrt{2} < p < 1$ then there exists a path between $x$ and $y$ in $TT_{n,p}$ with probability bounded away
from 0. If $p < 1/\sqrt{2}$ then w.h.p. no such path exists.

Proof. In order for a path to exist between $x$ and $y$, it must be the case that there exists an open branch
from a leaf $w$ to the root $x$ of the first tree, and that the mirroring branch from $w$ to $y$ is also open. This is
equivalent to the case of a single tree where each edge is open with probability $p^2$. It is well known that the
critical probability of a Galton-Watson tree (or the binary branching process) is $1/2$. See for instance [14] for
details. □
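The threshold at $p^2 = 1/2$ can be checked numerically. The sketch below is an illustration only (the function names are ours): it iterates the standard branching-process recursion for the probability that some branch of a given depth is open in both trees, and compares it with the positive fixed point $(2r - 1)/r^2$ of that recursion, where $r = p^2$.

```python
def mirrored_branch_probability(p, depth):
    """Probability that some root-to-leaf branch of length `depth` is open in both trees.

    A mirrored pair of edges is open with probability r = p * p. If q_k is the
    probability that an open mirrored branch of length k exists below a node,
    then q_0 = 1 and q_{k+1} = 1 - (1 - r * q_k) ** 2 (two independent children).
    """
    r = p * p
    q = 1.0
    for _ in range(depth):
        q = 1.0 - (1.0 - r * q) ** 2
    return q


def limiting_probability(p):
    """Positive fixed point (2r - 1) / r^2 of the recursion when r = p^2 > 1/2, else 0."""
    r = p * p
    return (2.0 * r - 1.0) / (r * r) if r > 0.5 else 0.0
```

For $p^2 > 1/2$ the iterates stay bounded away from 0 as the depth grows, while for $p^2 < 1/2$ they decay geometrically, in line with Lemma 6.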

Now we show that for any p < 1, the local routing complexity is exponential in the diameter of the 
graph. 

Theorem 7. Let $1/\sqrt{2} < p < 1$. For some $c > 0$ and any $a$, $n$, any local router between the two roots of $TT_n$
makes at least $a p^{-n}$ queries with probability at least $1 - ca$.

Proof. Apply Lemma 5 with $S$ being the second tree to get the desired bound: clearly we may take $\eta = p^n$.
The nodes $x$ and $y$ can be connected only via the cut $(S, \bar{S})$; this happens with probability at least $c(p)$.
Lemma 5 now implies

$$\Pr[X < a p^{-n}] \le \frac{a p^{-n} \cdot p^n}{c(p)} = \frac{a}{c(p)}.$$

□

If we let $a$ be a decaying function of $n$ (say $a = 1/n$), then the probability that a local router probes fewer than
$a p^{-n}$ edges is $O(a)$. The double binary tree has the interesting property that an oracle routing algorithm may find
a path between $x$ and $y$ with a polynomial number of probes. See Section 5.



3 Hypercube — Tight Upper and Lower Bounds 

In this section we show the exact location of the probability $p$ at which the routing complexity shows a
phase transition between being exponential and being polynomial (in $n$). The idea is to show that when
$p < 1/\sqrt{n}$, balls around nodes look more or less like trees, and therefore when trying to reach node $v$,
a routing algorithm would need to 'penetrate' a tree through its leaves, as was demonstrated in the double
binary tree graph. When $p > 1/\sqrt{n}$ there are enough edges so that some variant of greedy routing will
find a path within polynomial time.



3.1 The Lower Bound 

Here we apply Lemma 5 to get a lower bound on the local routing complexity of the hypercube when
$p < n^{-1/2}$. The given bound translates to a fractional exponential (in $n$) bound on the routing complexity.

Proof of Theorem 3(i). We apply Lemma 5 to the hypercube with $S$ being a ball of radius $l = n^\beta$ around $v$,
for some $0 < \beta < \alpha - 1/2$.

The first stage is to bound $\eta$ of the lemma, i.e. to bound the probability that $v$ is connected to an edge on
the boundary of $S$ via a path within $S$. We show that for large $n$, for any $e$ connecting $S$ and $\bar{S}$,
$\Pr[(v \sim e) \in S] \le \eta$ holds with $\eta = 2n^{(\beta - \alpha) n^\beta}$. Let $x$ be the endpoint of $e$ in $S$, so that $d(x, v) = l$. Consider
a path from $v$ to $x$ in $S$ as a sequence of coordinates in which consecutive steps are taken. Let $A_k$ be the set
of such paths of length $l + 2k$ (by parity this catches all paths).

For $k = 0$ we have $|A_0| = l!$, since a path of $A_0$ uses each of the $l$ coordinates exactly once. To bound
$|A_k|$ we show a map from $A_k$ to $A_{k-1}$ that maps at most $n l^2$ paths to each path. Existence of such a map
implies $|A_k| \le n l^2 |A_{k-1}|$ and therefore by induction $|A_k| \le n^k l^{2k} l!$. To define the map, consider the first
$l + 1$ steps of a path. Since the path remains in the ball $S$, at least one of the coordinates is repeated. Take
such a repeated coordinate and eliminate its first two occurrences. It is easy to see that this maps a path
$\pi \in A_k$ to a path $\pi' \in A_{k-1}$. To reconstruct $\pi$ from $\pi'$ one needs to know which coordinate was removed
($n$ possibilities) and the indices at which it appeared ($\binom{l+1}{2} \le l^2$ possibilities). Thus the pre-image of $\pi'$
contains at most $n l^2$ paths from $A_k$.
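The counting bound on the number of paths of length $l + 2k$ can be sanity-checked by brute force on small cubes. The sketch below is an illustration only: the function names are ours, paths are counted as walks (so non-simple paths are included, as in the proof), and $l \le n$ is assumed.

```python
from math import factorial


def count_paths_in_ball(n, l, k):
    """Count walks of length l + 2k from v = 0...0 to a fixed x with d(x, v) = l
    that never leave the Hamming ball of radius l around v in the n-cube."""
    v, x = 0, (1 << l) - 1
    counts = {v: 1}  # counts[w] = number of walks of the current length from v to w
    for _ in range(l + 2 * k):
        nxt = {}
        for w, c in counts.items():
            for i in range(n):
                z = w ^ (1 << i)
                if bin(z).count("1") <= l:  # stay inside the ball S
                    nxt[z] = nxt.get(z, 0) + c
        counts = nxt
    return counts.get(x, 0)


def path_bound(n, l, k):
    """The bound n^k * l^(2k) * l! on the number of such paths, as in the proof."""
    return n ** k * l ** (2 * k) * factorial(l)
```

For $k = 0$ the count equals $l!$ exactly, and for larger $k$ it stays below the bound, which is far from tight but suffices for the geometric series that follows.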

This bound clearly counts many paths more than once, as well as many non-simple paths, but it is good
enough. Each simple path in $A_k$ is open with probability $p^{l+2k}$, and so

$$\Pr[(v \sim x) \in S] \le \sum_{k=0}^{\infty} p^{l+2k} n^k l^{2k} l! \le (lp)^l \sum_{k=0}^{\infty} (n l^2 p^2)^k = \frac{(lp)^l}{1 - n^{2\beta + 1 - 2\alpha}}.$$

For large $n$ the denominator is close to 1, hence $\eta = 2n^{(\beta - \alpha) n^\beta}$ is a valid choice.

Next, we estimate the other terms in (1). Since each of $u$ and $v$ is in the giant component with probability
tending to 1, $\Pr[u \sim v] \to 1$. If $u \notin S$ then $\Pr[(u \sim v) \in S] = 0$. Otherwise, suppose $d(u, v) = m \le l$.
The same argument as above shows that the number of paths in $S$ of length $m + 2k$ from $u$ to $v$ is at most
$m! (n l^2)^k$, and hence the probability that any of them is open satisfies

$$\Pr[(u \sim v) \in S] \le \sum_{k=0}^{\infty} p^{m+2k} n^k l^{2k} m! = \frac{m! \, p^m}{1 - n^{2\beta + 1 - 2\alpha}}.$$

The denominator tends to 1 and, for $m \le l$, the numerator is $o(1)$ because $m! \, p^m \le (mp)^m$ and $mp \le lp = n^{\beta - \alpha}$.

Using Lemma 5 we now see that if the complexity of a local router in the hypercube is $X$, then

$$\Pr\left[X < n^{(\alpha - \beta) n^\beta} / n\right] \le \frac{2/n + \Pr[(u \sim v) \in S]}{\Pr[u \sim v]} = o(1).$$

□



3.2 The Upper Bound 

Next we show that when $p$ is large, local routing on the hypercube may be performed using $n^k$ probes with
high probability. This is a variation on the result of [3] showing that in this regime the metric distortion of
the percolation is bounded. This shows that there is indeed an asymptotic phase transition in the complexity
of routing on the hypercube. The proof below shows that $k = O((1 - 2\alpha)^{-1})$, though it would be interesting
to know the exact dependence of $k$ on $\alpha$ (the optimal $k$ need not be integral).

Proof of Theorem 3(ii). Here, the terms neighbor and distance relate to the metric of the hypercube before
percolation. Percolation neighbor and percolation distance are used for the percolated hypercube $H_{n,p}$.

We refer to the definition of a good vertex from [3], which roughly means having a high degree in $H_{n,p}$.
The condition that a vertex is good is determined by the neighborhood of percolation radius 2 around it. In
[3], Section 2, the following is proved:

(1) Any given vertex is good with probability $1 - \exp(-c n^{1-\alpha})$.

(2) With probability $1 - \exp(-cn)$, all pairs of good vertices at distance up to 3 have percolation distance
at most $l$ for some $l = l(\alpha) = O((1 - 2\alpha)^{-1})$.

Now the algorithm is straightforward. Pick arbitrarily a path from $u$ to $v$ of minimal length, $u =
u_0, u_1, \ldots, u_m = v$, and use BFS iteratively to find a path from $u_i$ to $u_{i+1}$. With probability tending to
1 all the vertices of the path, including $u$ and $v$, are good (each one is not good with probability at most
$\exp(-c n^{1-\alpha})$, and there are at most $n + 1$ vertices in the path). On this event, the percolation distance between $u_i$
and $u_{i+1}$ is at most $l$, and a path from $u_i$ to $u_{i+1}$ can be found by, say, BFS of complexity $n^l$. The total
complexity is at most $n^{l+1}$. □
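The routing procedure of the proof can be sketched as follows. This is an illustration under stated assumptions rather than the exact algorithm analyzed above: the function name `route_hypercube`, the lazy sampling of edge states, the particular shortest path (flipping differing coordinates in increasing order) and the explicit `max_depth` parameter, which plays the role of $l$, are choices made for the example, and no check that vertices are good is performed.

```python
import random
from collections import deque


def route_hypercube(n, u, v, p, max_depth, seed=0):
    """Route from u to v in a lazily sampled H_{n,p}; returns (path, probes)."""
    rng = random.Random(seed)
    state = {}  # (min, max) endpoint pair -> open?

    def is_open(a, b):
        key = (min(a, b), max(a, b))
        if key not in state:
            state[key] = rng.random() < p
        return state[key]

    def bounded_bfs(src, dst):
        """BFS from src over open edges, up to percolation distance max_depth."""
        parent = {src: None}
        queue = deque([(src, 0)])
        while queue:
            a, depth = queue.popleft()
            if depth == max_depth:
                continue
            for i in range(n):
                b = a ^ (1 << i)
                if b in parent or not is_open(a, b):
                    continue
                parent[b] = a
                if b == dst:
                    seg = [dst]
                    while parent[seg[-1]] is not None:
                        seg.append(parent[seg[-1]])
                    return seg[::-1]
                queue.append((b, depth + 1))
        return None

    # A shortest path u = u_0, ..., u_m = v in the fault-free cube:
    # flip the differing coordinates in increasing order.
    waypoints = [u]
    for i in range(n):
        if ((u ^ v) >> i) & 1:
            waypoints.append(waypoints[-1] ^ (1 << i))

    path = [u]
    for a, b in zip(waypoints, waypoints[1:]):
        seg = bounded_bfs(a, b)
        if seg is None:
            return None, len(state)
        path.extend(seg[1:])
    return path, len(state)
```

Every search starts from a vertex already reached from $u$, so the router is local. With `p=1.0` and `max_depth=1` it follows the chosen shortest path directly; with faults, each segment costs at most about $n^{l}$ probes for `max_depth` equal to $l$.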

Remark: A natural approach would be to use greedy routing, i.e. at each routing step, probe edges that 
reduce the Hamming distance to the target. While this strategy may work most of the way, in the final steps 
a more extensive search is required. It may be the case though that a greedy approach at the early stages of 
the routing would reduce the exponent in the complexity of the algorithm. 

4 The Mesh — Upper Bound 

In this section we show that the phenomenon observed for hypercubes does not occur when the mesh is
considered, i.e. whenever a giant component exists, it is possible to route efficiently between nodes.
Consider a cube of the $d$-dimensional mesh, i.e. a submesh with $M$ nodes, and let each edge remain open
with some fixed probability $p$, and be closed with probability $q = 1 - p$. Let $d(\cdot, \cdot)$ denote distance
in the mesh, and $D(\cdot, \cdot)$ denote the distance in the giant component (which may be referred to as percolation
distance). We seek a path between two vertices $u, v$ in the cube with $d(u, v) = n$ (the cube size is $M$, which
may be much larger than $n$). We are interested in the routing complexity in terms of $n$ when $p$ is fixed. As
mentioned, there exists a number $p_c^d$ such that if $p < p_c^d$ then $\Pr[u \sim v] = o(1)$ as $n \to \infty$, so hereinafter
we assume $p > p_c^d$. For such $p$ there is a giant cluster in the cube, and with probability bounded away from 0, both
$u$ and $v$ are in the giant component and therefore connected.
We give an algorithm that efficiently finds a short path from $u$ to $v$. The case of $d = 2$ was solved
by Naor and Wieder in [28], where planar duality is used to show that in a two-dimensional grid with $n^2$
vertices, the routing complexity is $O(n)$ w.h.p. It is important to note that it is fairly easy to find a path
between $u, v$ if we assume that $p$ is sufficiently close to 1. The main difficulty is pushing the probability $p$
all the way down to $p_c^d$. In order to do that we need some fairly recent and strong results from percolation
theory.

4.1 The Routing Algorithm 

The idea of the algorithm is as follows. Consider the vertices which belong to some shortest path between
$u$ and $v$. With high probability many of them are in the giant component and the percolation distance between
them is not too large. The algorithm searches around each of them, until the next one is found. More
formally:

1. Fix $u = u_0, u_1, u_2, \ldots, u_n = v$ to be a shortest path between $u$ and $v$. Start from $u_0 = u$.

2. Assume $u_i$ has been reached. Exhaustively probe edges around $u_i$ (using, say, BFS) until some vertex
$u_j$ with $j > i$ is reached.

3. Repeat at most $n$ times until reaching $u_n = v$.

Note that a BFS up to distance $k$ from a vertex takes only $O(k^d)$ queries, since only edges of the mesh
at distance at most $k$ from the starting point may be reached. The key point is that it is very unlikely at any iteration
that a large depth is needed. Correctness of the algorithm is clear since the search at each stage stops once a
closer approximation to $v$ is found. If an open path from $u$ to $v$ exists, then some path will be found.
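The three steps above can be sketched as follows. This is a minimal illustration under assumptions of our own: the name `route_mesh`, the lazy sampling of edge states, the cube $\{0, \ldots, \text{side} - 1\}^d$ with vertices given as tuples, and the particular shortest path (adjusting one coordinate at a time) are not taken from the paper. The returned walk is open but need not be a simple path.

```python
import random
from collections import deque


def route_mesh(side, u, v, p, seed=0):
    """Route from u to v in a percolated cube {0..side-1}^d; returns (walk, probes)."""
    d = len(u)
    rng = random.Random(seed)
    state = {}  # sorted endpoint pair -> open?

    def is_open(a, b):
        key = (a, b) if a <= b else (b, a)
        if key not in state:
            state[key] = rng.random() < p
        return state[key]

    def neighbors(a):
        for i in range(d):
            for step in (-1, 1):
                c = a[i] + step
                if 0 <= c < side:
                    yield a[:i] + (c,) + a[i + 1:]

    # Step 1: fix a shortest mesh path u = u_0, ..., u_n = v (one coordinate at a time).
    waypoints = [tuple(u)]
    cur = list(u)
    for i in range(d):
        while cur[i] != v[i]:
            cur[i] += 1 if v[i] > cur[i] else -1
            waypoints.append(tuple(cur))
    index = {w: j for j, w in enumerate(waypoints)}

    walk, i = [waypoints[0]], 0
    while i < len(waypoints) - 1:
        # Step 2: BFS over open edges from u_i until some u_j with j > i is reached.
        src = waypoints[i]
        parent = {src: None}
        queue = deque([src])
        found = None
        while queue and found is None:
            a = queue.popleft()
            for b in neighbors(a):
                if b in parent or not is_open(a, b):
                    continue
                parent[b] = a
                if index.get(b, -1) > i:
                    found = b
                    break
                queue.append(b)
        if found is None:
            return None, len(state)  # the open cluster of u_i contains no later u_j
        seg = [found]
        while parent[seg[-1]] is not None:
            seg.append(parent[seg[-1]])
        walk.extend(reversed(seg[:-1]))
        i = index[found]  # Step 3: repeat from the newly reached u_j
    return walk, len(state)
```

Edge states are cached, so an edge examined in several searches is counted once in `probes`.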

Proof of Theorem 4. Let $u_i$ be some vertex along the chosen path to $v$ that is in the giant component. Let
$u_j$ be the next vertex along the path in the giant component. It follows that the $j - i - 1$ vertices along the
path between them are outside the giant component, an event that is exponentially unlikely (see [12]):

$$\Pr[|j - i| > k] < e^{-c_1 k} \quad \text{for some } c_1 = c_1(p) > 0.$$

Note that $j$ always exists since $v$ is assumed to be in the giant component. In practice, $u_j$ might be
skipped over by the algorithm if some further vertex $u_k$ is reached first. If the algorithm explores a
neighborhood of $u_i$, it finds a further vertex of the path at distance at most $D(u_i, u_j)$. Thus to bound the number
of queries the algorithm makes to reach some $u_k$ we use the following lemma, which is a restatement
of a result by Antal and Pisztora [4], [13].

Lemma 8. For any $p > p_c^d$ and any $x, y$ in a cube $M^d$ of the infinite mesh, let $D(x, y)$ be the percolation
distance (in $M^d$) between them. For some $\rho, c_2 > 0$ depending only on the dimension and $p$, and for any
$a > \rho \cdot d(x, y)$,

$$\Pr[(D(x, y) > a) \wedge ((x \sim y) \in M^d)] < e^{-c_2 a}.$$

Either $d(u_i, u_j)$ is large or it is small. In the latter case, $D(u_i, u_j)$ is unlikely to be large, and the former
case is itself unlikely:

$$\Pr[D(u_i, u_j) > k] \le \Pr[d(u_i, u_j) > k/\rho] + \Pr[(d(u_i, u_j) \le k/\rho) \wedge (D(u_i, u_j) > k)]
\le e^{-c_1 k/\rho} + e^{-c_2 k} \le e^{-c_3 k}.$$

Consequently, if $X_i$ is the number of queries made from $u_i$,

$$\Pr[X_i > k] \le \Pr[D(u_i, u_j) > c k^{1/d}] \le e^{-c_4 k^{1/d}}.$$
Since this is summable, for each vertex $u_i$ of the path that is in the giant component the expected work to
get from $u_i$ to a further vertex is $O(1)$.
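For completeness, the summability claim can be spelled out. Writing $X_i$ for the number of queries made from $u_i$ and $c > 0$ for the constant in the exponent of the preceding tail bound (both names are ours), the tail-sum formula and a comparison with an integral give:

```latex
\mathbb{E}[X_i] = \sum_{k \ge 0} \Pr[X_i > k]
  \le \sum_{k \ge 0} e^{-c k^{1/d}}
  \le 1 + \int_0^\infty e^{-c x^{1/d}} \, dx
  = 1 + \frac{d!}{c^d} = O(1).
```

The integral is evaluated with the substitution $y = c x^{1/d}$, which turns it into $d \, c^{-d} \int_0^\infty y^{d-1} e^{-y} \, dy = d!/c^d$.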

The number of queries made by the algorithm is at most the sum over all vertices of the path in the giant
component of the work to progress from them (actually it is less since some may be skipped over, and some
queries may be duplicated). By additivity of expectation, the expected total number of queries is $O(n)$. □

5 Oracle Routing vs. Local Routing 

In this section we consider routing algorithms that are allowed to query any edge, and not just edges to which
they have established a path. This is called oracle routing. Surprisingly, there can be a huge gap
between the complexity of local and oracle routing. A simple (yet somewhat artificial) example of
this is the double binary tree $TT_n$ with fixed $\sqrt{1/2} < p < 1$. In Section 2 we showed that any local routing
algorithm which finds a path between the two roots of $TT_n$ w.h.p. makes exponentially many queries. The
following theorem shows that oracle routing algorithms can do significantly better.

Theorem 9. There is an oracle router between the two roots of $TT_n$ with average complexity $cn$ for some
$c = c(p) < \infty$ and any $p > \sqrt{1/2}$.

Proof. A simple path between the two roots is just a branch up to level n in the first tree joined to the 
corresponding branch in the second tree. The oracle router is very simple: To find a path from the root to 
level n that is open in both trees, query edges together with their corresponding edges in the second tree. 
Each such pair of edges is open with probability p 2 > 1/2. The problem is equivalent to finding a path 
from the root to level n in a super-critical Galton-Watson tree, and a depth first search accomplishes this in 
expected complexity linear in n. 

To see this, observe that any branch of the infinite binary branching process (the Galton-Watson tree) 
that fails to reach level $n$ has expected size $c(p)$ (and is in fact exponentially unlikely to be large); see [25]. 
Since at most n bad branches are encountered before reaching level n, the routing complexity is bounded 
by a sum of n random variables with finite expectation and second moments, and hence is linear in n. □ 
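
The depth-first oracle router described in this proof can be sketched in a few lines. This is only an 
illustration under stated assumptions: the function name `oracle_route_double_tree`, the encoding of a branch 
as a tuple of bits, and the lazy sampling of edge states (each edge is sampled the first and only time it is 
queried) are our own choices and do not appear in the text above. 

```python
import random


def oracle_route_double_tree(n, p, rng):
    """Depth-first search for a branch that is open in both trees of TT_n.

    A branch is a tuple of n bits (left/right choices from the root).
    Edge states are sampled lazily: every queried edge is open with
    probability p, independently; each edge is queried at most once.
    Returns (branch, queries); branch is None if no open branch exists.
    """
    queries = 0
    stack = [()]  # partial branches known to be open in both trees
    while stack:
        branch = stack.pop()  # LIFO order gives depth-first search
        if len(branch) == n:
            return branch, queries
        for bit in (1, 0):
            queries += 1  # query the edge to this child in the first tree
            if rng.random() >= p:
                continue  # closed in the first tree, skip the twin edge
            queries += 1  # query the corresponding edge in the second tree
            if rng.random() < p:
                stack.append(branch + (bit,))
    return None, queries


if __name__ == "__main__":
    branch, queries = oracle_route_double_tree(200, 0.9, random.Random(0))
    print("found" if branch is not None else "no open branch", queries)
```

Averaging the returned query count over many seeds, for several values of $n$, gives a quick empirical check 
of the linear growth claimed in Theorem 9. 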

A more natural example is the graph $G_{n,p}$: for each pair of nodes $u, v$ the edge $(u, v)$ is open with 
probability $p$. In our setting it can be thought of as a faulty complete graph. It turns out that local routers 
cannot do much better than querying all the edges: 

Theorem 10. Any local routing algorithm for the $G_{n,p}$ model where $p = c/n$ (for $c > 1$) has an expected 
local routing complexity of $\Omega(n^2)$. 

Proof. Assume we wish to route from $u$ to $v$, and let $X$ be the number of queries required. Let $U_t$ be the set 
of vertices of the graph which are connected to $u$ by paths known to be open after $t$ queries. Thus $U_0 = \{u\}$. 
Each vertex in $U_t$ has probability $p$ of being connected to $v$, thus the probability of finding a route while 
$|U_t| \le k$ is at most $pk$. 

To reach an additional vertex given that a set of vertices has been reached, the only option is to probe an 
edge connecting $U_t$ to its complement. By symmetry all such edges are equivalent, and each has probability 
$c/n$ of connecting to a new vertex. Thus $|U_t| - 1$ is just a sum of 0-1 random variables, each with expectation 
$p = c/n$. 

Since $u, v$ are both in the giant component with probability at least some $\alpha > 0$, it follows that 

$$\Pr[X < k \mid u \sim v] \le \frac{\Pr[|U_k| > n\sqrt{k}\,p] + \Pr[X < k \mid |U_k| \le n\sqrt{k}\,p]}{\Pr[u \sim v]} 
\le \frac{\Pr[|U_k| > n\sqrt{k}\,p] + p \cdot n\sqrt{k}\,p}{\Pr[u \sim v]},$$

where the last inequality uses the bound above on the probability of finding a route while $|U_k|$ is small. 
Applying Markov's inequality to $|U_k|$, whose expectation is at most $1 + kp$, shows that this is close to 0 
for $k = o(n^2)$, and hence that the average complexity is $\Omega(n^2)$. □ 

Next we give tight upper and lower bounds on the oracle routing complexity. The next theorem implies 
that oracle routing in this case is better than local routing by a factor of order $\sqrt{n}$. 



Theorem 11. There exists a routing algorithm with average complexity $O(n\sqrt{n})$. Any algorithm succeeds 
with $\alpha n\sqrt{n}$ queries with probability at most $O(c\alpha^{2/3} + cn^{-1})$. 

Proof. Let $U_t$ and $V_t$ be the sets of vertices reachable from $u$ and $v$ after $t$ queries. For the upper bound, 
consider the following algorithm: 

(1) Whenever there are unqueried edges between $V_t$ and $U_t$, probe one of them. 

(2) Otherwise, pick the smaller of $U_t$, $V_t$ and probe an unprobed edge connecting it to a previously 
unreached vertex. 

(3) If no such edge exists, return that $u \not\sim v$. 

The algorithm is trivially correct. Since $U_t$ and $V_t$ grow by one vertex at a time, they are roughly of 
equal size. Since each edge is open with probability $c/n$, on average a connection between $U_t$ and $V_t$ will 
be found when $|U_t| = |V_t| = \Theta(\sqrt{n})$. Since adding a vertex to either of the sets requires a number of queries 
with geometric distribution and mean $n/c$, it takes $O(n^{3/2})$ queries to find a path from $u$ to $v$. 
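
Steps (1)-(3) of the algorithm can be sketched as a small simulation. This is a minimal sketch under stated 
assumptions: the name `oracle_route_gnp`, the candidate lists `cross` and `outward`, and the lazy sampling of 
each edge on its first probe are hypothetical implementation choices of ours; the sketch only reports whether 
a connection was found and how many queries were made, not the path itself. 

```python
import random


def oracle_route_gnp(n, c, u, v, rng):
    """Bidirectional oracle search between u and v in G(n, p) with p = c/n.

    Returns (connected, queries). Each edge is sampled lazily on its first
    probe and is never probed twice.
    """
    p = c / n
    side = [{u}, {v}]          # side[0] plays the role of U_t, side[1] of V_t
    cross = [(u, v)]           # candidate edges between U_t and V_t
    outward = [[(u, w) for w in range(n)], [(v, w) for w in range(n)]]
    probed = set()
    queries = 0

    def probe(a, b):
        nonlocal queries
        probed.add(frozenset((a, b)))
        queries += 1
        return rng.random() < p

    while True:
        # Step (1): probe an unqueried edge between U_t and V_t, if any.
        while cross and frozenset(cross[-1]) in probed:
            cross.pop()
        if cross:
            a, b = cross.pop()
            if probe(a, b):
                return True, queries
            continue
        # Step (2): grow the smaller side towards an unreached vertex.
        s = 0 if len(side[0]) <= len(side[1]) else 1
        cand = outward[s]
        while cand and (cand[-1][1] in side[0] or cand[-1][1] in side[1]
                        or frozenset(cand[-1]) in probed):
            cand.pop()
        if not cand:
            # Step (3): the smaller side is a full component without a link.
            return False, queries
        a, w = cand.pop()
        if probe(a, w):
            side[s].add(w)
            cross.extend((w, y) for y in side[1 - s])
            outward[s].extend((w, z) for z in range(n))


if __name__ == "__main__":
    print(oracle_route_gnp(400, 2.0, 0, 1, random.Random(0)))
```

Running the sketch for increasing $n$ and averaging the query count over the runs in which a connection is 
found gives an empirical view of the $n^{3/2}$ scaling discussed above. 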

For the lower bound, note that $|U_t \cup V_t| \le 2 + s_t$, where $s_t$ is the number of open edges found by time $t$, 
and $E[s_t] = ct/n$. Before a connection from $u$ to $v$ is found there must be an open (unprobed) edge between 
$U_t$ and $V_t$, and the probability of that is at most $\frac{c}{n}|U_t||V_t| \le \frac{c(s_t+2)^2}{4n}$. Thus for any 
algorithm $A$ and any $\lambda$ 

$$\Pr[\mathrm{comp}(A) \le t] \le \Pr[s_t > \lambda] + \frac{c(\lambda+2)^2}{4n} \le \frac{ct}{n\lambda} + \frac{2c(\lambda^2+4)}{4n}.$$

If $t = \alpha n^{3/2}$, then setting $\lambda = t^{1/3}$ results in 

$$\Pr[\mathrm{comp}(A) \le \alpha n^{3/2}] \le \frac{3c\alpha^{2/3}}{2} + \frac{2c}{n}.$$



□ 
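
For completeness, the final substitution in the lower bound can be spelled out. With $t = \alpha n^{3/2}$ and 
$\lambda = t^{1/3} = \alpha^{1/3} n^{1/2}$, the two terms of the bound $\frac{ct}{n\lambda} + \frac{2c(\lambda^2+4)}{4n}$ 
evaluate as follows (this is only a worked check of the arithmetic, not an additional claim): 

```latex
\begin{align*}
\frac{ct}{n\lambda}
  &= \frac{c\,\alpha n^{3/2}}{n \cdot \alpha^{1/3} n^{1/2}} = c\,\alpha^{2/3}, \\
\frac{2c(\lambda^2+4)}{4n}
  &= \frac{c\,(\alpha^{2/3} n + 4)}{2n} = \frac{c\,\alpha^{2/3}}{2} + \frac{2c}{n}, \\
\frac{ct}{n\lambda} + \frac{2c(\lambda^2+4)}{4n}
  &= \frac{3c\,\alpha^{2/3}}{2} + \frac{2c}{n}.
\end{align*}
```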



6 Open Questions 

So far we have observed that sometimes efficient routing is possible whenever the giant component exists, and 
sometimes the routing complexity has a phase transition at a different value of the percolation parameter. It 
is natural to assume that this phenomenon relates to the growth rate of the graph. In particular: 
• Prove or refute: there exists a family of constant degree graphs in which the diameter is logarithmic 
in the number of nodes, and the locations of the phase transitions of percolation and routing coincide 
(at a location bounded away from 1). 

In particular it would be interesting to analyze De-Bruijn graphs, Shuffle-Exchange graphs, Butterflies and 
other families often used in the context of parallel computing. 

It would be interesting to see hardness results for oracle routers. Above we see that in the complete 
graph the best oracle router has complexity $\Theta(n^{3/2})$, whereas the giant component has diameter $O(\log n)$. If $p$ 
were a small power of $n$, then the diameter would be $O(1)$ and the complexity would still be some power 
of $n$, and thus there is no bound for the complexity in terms of the diameter. The results of [3] suggest that 
oracle routing would not help in the hypercube. 

• Prove that for $\frac{1}{n} < p < \frac{1}{\sqrt{n}}$ the oracle routing complexity of the hypercube is exponential in $n$. 